ftp.cs.arizona.edu

Internet Message Format | 1994-02-02 | 2KB

Received: from owl.CS.Arizona.EDU by cheltenham.CS.Arizona.EDU; Mon, 20 Sep 1993 15:07:35 MST
Received: by owl.cs.arizona.edu; Mon, 20 Sep 1993 15:07:34 MST
Date: Mon, 20 Sep 93 14:57:32 -0400
From: ptho@seq1.loc.gov (Phillip Lee Thomas)
Message-Id: <9309201857.AA05326@seq1.loc.gov>
To: icon-group@cs.arizona.edu
Subject: Yet another question...
Status: R
Errors-To: icon-group-errors@cs.arizona.edu

About Chris Fagyal's text search problem...

There is too little information in the problem to give an optimal answer. The open questions are:

1) Do all entries have exactly two data lines each?
2) How big is the database?

Brute force: Your index lines are entirely numeric, so match for a digit in column one and see if you can convert the whole line into a number. If you can, and the number is the one you are looking for, keep reading until the next index line.

Indexed: Keep a separate file recording where each number is located, and seek to that location. The index file could be kept in memory for faster access.

Binary: Seek to the middle of your file and read until you get a number line. If the number is too big, seek halfway between the start and the current position, and so on. (This only works if the index numbers appear in sorted order.) Be sure to test your boundaries so that you can retrieve the first and last index lines.

Your main problems are whether the entire file, only the index, or neither can be kept in memory, and whether the data lines have some structure.

The Icon library has some procedures that might provide a model: see idxtext and associated procedures.

Phillip Lee Thomas
Library of Congress
ptho@seq1.loc.gov
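
The indexed and binary approaches described in the message can be sketched roughly as follows. This is only an illustration in Python, not the Icon library's idxtext code, and it rests on assumptions the original problem does not confirm: an index line consists of digits only, data lines are never purely numeric, and (for the binary search) index numbers appear in ascending order. All function names here are made up for the example.

```python
import io

def is_index_line(line: bytes) -> bool:
    # Assumption: an index line holds nothing but digits.
    return line.strip().isdigit()

def build_index(f):
    """Indexed approach: map each index number to the byte offset of its line."""
    offsets = {}
    f.seek(0)
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:
            return offsets
        if is_index_line(line):
            offsets[int(line.strip())] = pos

def next_index(f, p):
    """Return (offset, number) of the first index line starting at or after byte p."""
    if p > 0:
        f.seek(p - 1)
        f.readline()          # finish whatever line contains byte p-1
    else:
        f.seek(0)
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:
            return None       # ran off the end of the file
        if is_index_line(line):
            return pos, int(line.strip())

def find_binary(f, target):
    """Binary approach: bisect on byte offsets; assumes ascending index numbers."""
    f.seek(0, io.SEEK_END)
    lo, hi = 0, f.tell()
    while lo < hi:
        mid = (lo + hi) // 2
        hit = next_index(f, mid)
        if hit is None or hit[1] >= target:
            hi = mid          # target, if present, starts at or after some p <= mid
        else:
            lo = mid + 1      # the next index line is still too small
    hit = next_index(f, lo)
    return hit[0] if hit and hit[1] == target else None

def read_record(f, pos):
    """Read the data lines that follow the index line at offset pos."""
    f.seek(pos)
    f.readline()              # skip the index line itself
    data = []
    for line in iter(f.readline, b""):
        if is_index_line(line):
            break
        data.append(line.decode().rstrip("\n"))
    return data
```

The file must be opened in binary mode (for example open(path, "rb")) so that tell() and seek() work on plain byte offsets. The dictionary returned by build_index could be written out to a separate index file, or simply kept in memory if it fits. Searching for the first and the last index number is a quick check that the boundaries are handled.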